AIbase

# Multimodal document retrieval

Holo1 3B GGUF
Other
Holo1-3B is a Transformer-based multimodal model focused on visual document retrieval. It performs strongly on the WebVoyager benchmark while balancing accuracy and cost.
Image-to-Text Transformers English
Mungert
583
0
Holo1 7B GGUF
Apache-2.0
Holo1-7B GGUF is part of the Surfer-H system and suits multimodal tasks such as visual document retrieval. It is particularly strong at web page interaction and network monitoring, achieving high accuracy at low cost.
Image-to-Text Transformers English
Mungert
663
0
Granite Vision 3.3 2b Embedding
Apache-2.0
An efficient embedding model built on granite-vision-3.3-2b and designed for multimodal document retrieval. It can process documents containing tables, charts, infographics, and complex layouts.
Multimodal Fusion Transformers English
ibm-granite
205
4
Colqwen2 2b V1.0
A visual retrieval model based on Qwen2-VL-2B-Instruct and the ColBERT strategy, capable of generating multi-vector representations of text and images.
Text-to-Image Supports Multiple Languages
tsystems
700
1
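
The ColBERT strategy mentioned in the Colqwen2 entry scores a query against a document by late interaction over multi-vector representations: each query vector is matched to its most similar document vector, and the per-vector maxima are summed. The following is a minimal sketch of that scoring rule in plain Python. It is not the ColQwen2 API; the function names (`l2_normalize`, `maxsim_score`, `rank_documents`) and the two-dimensional toy embeddings are hypothetical and stand in for the vectors a real model would produce.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs):
    """ColBERT-style late interaction: for each query vector, take the
    maximum cosine similarity over all document vectors, then sum."""
    q = [l2_normalize(v) for v in query_vecs]
    d = [l2_normalize(v) for v in doc_vecs]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


def rank_documents(query_vecs, docs):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {name: maxsim_score(query_vecs, vecs) for name, vecs in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy embeddings: one vector per query token or page patch.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "page_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "page_b": [[1.0, 1.0]],
}

ranking = rank_documents(query, docs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example `page_a` ranks first because it contains an exact match for each query vector, whereas `page_b` has only a single vector that partially matches both.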
© 2025 AIbase